Explainable Machine Learning


Advanced Programming for Data Science

     

Lecturer: Dr. HAS Sothea

Content

  • Motivation & Introduction
    • Interpretable ML vs Explainable ML
  • Interpretable Models:
    • Linear Models
    • Decision Trees
  • Explainable Boosting Machine (EBM)
    • Bootstrap Aggregating: Bagging
    • Random Forest/Extremely Randomized Trees
    • Boosting: Adaboost & XGBoost
  • Model-Agnostic Explanation Methods
    • LIME: Local Interpretable Model-Agnostic Explanations
    • SHAP: SHapley Additive exPlanations
    • Feature Importance: Permutation Importance…

Motivation & Introduction

Open the “black box” of ML models

The “Black Box” Problem

  • Modern AI/ML models (like Deep Learning) are powerful but complex.
  • They often act as “Black Boxes”: we see the input and the output, but not the internal decision-making process.
  • In practice, this lack of transparency can lead to:
    • Mistrust from users and stakeholders.
    • Difficulty in debugging and improving models.
    • Challenges in regulatory compliance, especially in sensitive areas like healthcare and finance.
  • Therefore, explainability & interpretability may be more important than accuracy in some contexts.

Explainable Machine Learning (XML)

  • Explainable Machine Learning (XML) is a subfield of ML that focuses on developing models and techniques that provide insights into how and why certain predictions or decisions are made by the model.
  • The goal of XML is to make machine learning models more transparent, interpretable, and understandable to humans.

Goals of Explainable ML

  • Transparency: Making the inner workings of ML models visible and understandable.
  • Trust: Building confidence in ML models by providing clear explanations for their predictions.
  • Accountability: Enabling stakeholders to hold models accountable for their decisions.
  • Debugging: Helping developers identify and fix issues in ML models.
  • Regulatory Compliance: Ensuring that ML models meet legal and ethical standards for explainability.

Interpretable ML vs Explainable ML

Interpretable ML

  • Interpretable ML focuses on building models that are inherently understandable by humans.
  • These models are designed to be simple and transparent, allowing users to easily grasp how predictions are made.
  • Examples of interpretable models include:
    • Linear Regression
    • Decision Trees
    • Rule-Based Models
  • The main advantage of interpretable models is that they provide clear insights into the decision-making process.

Interpretable ML vs Explainable ML

Explainable ML

  • Explainable ML focuses on providing explanations for complex, “black box” models that are not inherently interpretable.
  • These models may achieve higher accuracy but lack transparency.
  • Explainable ML techniques aim to shed light on how these models make predictions, often through post-hoc analysis.
  • Examples of explainable ML techniques include:
    • LIME (Local Interpretable Model-Agnostic Explanations)
    • SHAP (SHapley Additive exPlanations)
    • Feature Importance
  • The main advantage of explainable ML is that it allows users to understand and trust complex models.

Summary

Interpretable ML vs Explainable ML

Aspect | Interpretability | Explainability
Focus | How the model works internally (inner mechanics and logic). | Why a specific decision or prediction was made (the rationale or justification).
Transparency | Inherent transparency in model design (“white box” models like linear models or decision trees). | Often achieved through post-hoc (after-the-fact) methods applied to complex “black box” models (like deep learning).
Goal | Global understanding of the entire model’s operation. | Local explanations for individual predictions, which can be aggregated for general insights.
Methods | Use of simple, inherently interpretable models. | Use of techniques like LIME, SHAP, and Feature Importance to explain complex models.
Target Audience | Primarily data scientists and developers for debugging and improvement. | End-users, stakeholders, and regulators to build trust and ensure compliance.

I. Interpretable Models

Model is Interpretable by Design

  • EDA is crucial to understand data before modeling:
    • Identify problems (take note) for correct data preprocessing…
    • Detect patterns and relationships (input-output) for feature selection and engineering…
    • Guide model choice…
  • Model choice: interpretable models like Linear Models and Decision Trees are preferred when interpretability is a priority.
  • Interpretability: based on model structure and parameters.

1. Linear Models

  • The connection between input \(X\) and output prediction \(y\) is explicitly linear in parameters.
  • Coefficients indicate the strength and direction of relationships.

Regression Models:

  • Linear Regression: \[\hat{y} = \color{blue}{\beta_0} + \color{blue}{\beta_1}\text{x}_1 + \color{blue}{\beta_2}\text{x}_2 + ... + \color{blue}{\beta_d}\text{x}_d.\]

  • Polynomial regression: \[\hat{y}=\color{blue}{\beta_0}+\sum_{j=1}^d\color{blue}{\beta_j}\text{x}_j+\sum_{k=1}^{\color{red}{p}}\sum_{\ell,m=1}^d\color{blue}{\gamma_{\ell,m,k}}\text{x}_\ell^k\text{x}_m^{{\color{red}{p}}-k}.\]

  • Regularized versions (Ridge, Lasso, Elastic Net) add penalties to prevent overfitting.

Classification Models:

  • Logistic Regression: \[\mathbb{P}(\hat{Y}=1|X=\text{x}) = \sigma(\color{blue}{\beta_0} + \sum_{j=1}^d\color{blue}{\beta_j}\text{x}_j),\] where \(\sigma(t)=1/(1+e^{-t})\) is the sigmoid (logistic) function.

  • Polynomial Logistic Regression: \[\mathbb{P}(\hat{Y}=1|X=\text{x})=\sigma(\color{blue}{\beta_0}+\sum_{j=1}^d\color{blue}{\beta_j}\text{x}_j+\sum_{k=1}^{\color{red}{p}}\sum_{\ell,m=1}^d\color{blue}{\gamma_{\ell,m,k}}\text{x}_\ell^k\text{x}_m^{{\color{red}{p}}-k}).\]

  • Regularized versions (Ridge, Lasso, Elastic Net) add penalties to prevent overfitting.

1. Linear Models

1.1. Linear Regression

mpg cylinders displacement horsepower weight acceleration model year origin car name
0 18.0 8 307.0 130 3504 12.0 70 1 chevrolet chevelle malibu
1 15.0 8 350.0 165 3693 11.5 70 1 buick skylark 320
2 18.0 8 318.0 150 3436 11.0 70 1 plymouth satellite
3 16.0 8 304.0 150 3433 12.0 70 1 amc rebel sst
4 17.0 8 302.0 140 3449 10.5 70 1 ford torino
  • Variables:
    • mpg: miles per gallon (target variable)
    • cylinders: number of cylinders
    • displacement: engine displacement
    • horsepower: engine horsepower
    • weight: vehicle weight
    • acceleration: time to accelerate from 0 to 60 mph
    • model year: year of manufacture
    • origin: origin of the car (1: USA, 2: Europe, 3: Asia).
  • Goal: Predict mpg based on the other features (a minimal loading sketch is given below).
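The DataFrame data used throughout this section is assumed to be loaded beforehand. A minimal loading sketch, assuming a local copy of the Auto MPG dataset saved as auto-mpg.csv (hypothetical path) with the columns listed above:

Code
import pandas as pd
# Hypothetical local copy of the Auto MPG dataset; adjust the path to your setup.
data = pd.read_csv("auto-mpg.csv")
print(data.shape)
print(data.head())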

1. Linear Models

1.1. Linear Regression

1.1.1. EDA

  • Data types:
mpg cylinders displacement horsepower weight acceleration model year origin
Type float64 int64 float64 object int64 float64 int64 int64
  • Q1: Is there anything wrong with the column types?
  • A1: Two main problems:
    • origin is qualitative, therefore should be “category/object”.
    • ⚠️ horsepower is quantitative, therefore should be “float/int”.
  • Modifying the data types (a conversion sketch follows the table):
mpg cylinders displacement horsepower weight acceleration model year origin
Type float64 int64 float64 int64 int64 float64 int64 category
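A minimal sketch of these conversions, assuming the DataFrame is named data and that the non-numeric horsepower entries are marked with "?" in the raw file (an assumption about the source data):

Code
# horsepower was read as object because of non-numeric markers; coerce them to NaN and drop.
data["horsepower"] = pd.to_numeric(data["horsepower"], errors="coerce")
data = data.dropna(subset=["horsepower"])
data["horsepower"] = data["horsepower"].astype(int)
# origin encodes a qualitative variable (1: USA, 2: Europe, 3: Asia).
data["origin"] = data["origin"].astype("category")
print(data.dtypes)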

1. Linear Models

1.1. Linear Regression

1.1.1. EDA: Univariate analysis

Code
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid")
quan_vars = data.select_dtypes(include="number").columns
fig, axs = plt.subplots(2, 4, figsize=(10, 4.5))
for i, va in enumerate(data.columns):
    if va in quan_vars:
        # Histogram + KDE for quantitative variables
        sns.histplot(data, x=va, kde=True, ax=axs[i//4, i%4], stat="proportion")
    else:
        # Bar chart for the categorical variable origin; skip the identifier column
        if va != "car name":
            sns.countplot(data, x=va, ax=axs[i//4, i%4], stat="proportion")
            axs[i//4, i%4].bar_label(axs[i//4, i%4].containers[0], fmt="%.2f")
plt.tight_layout()
plt.show()

1. Linear Models

1.1. Linear Regression

1.1.1. EDA: Bivariate analysis

Code
import numpy as np
pair_grid = sns.PairGrid(data=data[quan_vars], height=0.7, aspect=2)

# Map plots to the lower triangle only
pair_grid.map_lower(sns.scatterplot)  # Scatterplots in the lower triangle
pair_grid.map_diag(sns.histplot)      # Histograms on the diagonal

def corr_func(x, y, **kws): 
    r1 = np.corrcoef(x, y)[0, 1]
    plt.gca().annotate(f"{r1:.2f}", xy=(0.5, 0.5), 
                       xycoords='axes fraction', 
                       ha='center', fontsize=20, color='#1d69d1')

pair_grid.map_upper(corr_func)
for ax in pair_grid.axes[:, 0]:  # Access the first column of axes (y-axis labels)
    ax.set_ylabel(ax.get_ylabel(), rotation=45, labelpad=20)
plt.tight_layout()
plt.show()

1. Linear Models

1.1. Linear Regression

1.1.1. EDA: Bivariate analysis

  • Does fuel-efficiency depend on the origin?
Code
_, axs = plt.subplots(1, 1, figsize=(7, 3.5))
sns.boxplot(data=data, x="origin", y="mpg", hue="origin", ax=axs)
plt.tight_layout()
plt.show()

1. Linear Models

1.1. Linear Regression

1.1.2. EDA: Summary

  • Weight shows the strongest negative correlation with mpg, followed by displacement, cylinders, and horsepower. These variables are significant in explaining variations in mpg.

  • These features are also highly correlated with each other, suggesting potential redundancy when included together in a predictive model.

  • Despite being a categorical variable, origin proves to be valuable for predicting mpg.

  • Many inputs are strongly, linearly related to the target mpg, which suggests the usefulness of Linear Models (these correlations can also be checked numerically, as sketched below).
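A quick numerical check of the pairwise correlations with mpg, reusing the data and quan_vars objects defined earlier:

Code
# Correlation of each quantitative variable with the target mpg, sorted
print(data[quan_vars].corr()["mpg"].sort_values())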

1. Linear Models

1.1. Linear Regression

1.1.2. Simple & Multiple LR

Simple Linear Regression

  • Model: \[\hat{y} = \color{blue}{\beta_0} + \color{blue}{\beta_1}\text{x}_1.\]
  • The best coefficients \(\color{blue}{\vec{\widehat{\beta}}}=[\color{blue}{\beta_0},\color{blue}{\beta_1}]\) are found by: \[\color{blue}{\vec{\widehat{\beta}}}=\arg\min_{\color{blue}{\vec{\beta}}}\frac{1}{2n}\sum_{i=1}^n(y_i-\color{blue}{\hat{y}_i})^2\]

Multiple Linear Regression

  • Model: \[\hat{y} = \color{blue}{\beta_0} + \color{blue}{\beta_1}\text{x}_1 + \color{blue}{\beta_2}\text{x}_2 + ... + \color{blue}{\beta_d}\text{x}_d.\]
  • The best coefficients \(\color{blue}{\vec{\widehat{\beta}}}=[\color{blue}{\beta_0},\dots,\color{blue}{\beta_d}]\) are found by: \[\color{blue}{\vec{\widehat{\beta}}}=\arg\min_{\color{blue}{\vec{\beta}}}\frac{1}{2n}\sum_{i=1}^n(y_i-\color{blue}{\hat{y}_i})^2\]
  • Analytic Solution: \(\color{blue}{\vec{\widehat{\beta}}}=\color{blue}{(X^TX)^{-1}X^Ty}.\)
  • Prediction: \(\hat{y}=\color{blue}{X\vec{\widehat{\beta}}}=\color{blue}{P}y\), where \(\color{blue}{P}\) is the projection matrix onto \(\text{span}(X)\) (a small numerical check of the analytic solution is sketched below).
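A minimal numerical check of the analytic solution against scikit-learn, using two illustrative predictors (weight and cylinders) from the data DataFrame:

Code
import numpy as np
from sklearn.linear_model import LinearRegression

# Design matrix with an explicit intercept column, and the target
X = data[["weight", "cylinders"]].to_numpy(dtype=float)
X1 = np.column_stack([np.ones(len(X)), X])
y = data["mpg"].to_numpy(dtype=float)

# Analytic solution: beta_hat = (X^T X)^{-1} X^T y (solve is more stable than an explicit inverse)
beta_hat = np.linalg.solve(X1.T @ X1, X1.T @ y)

# Compare with scikit-learn's LinearRegression
lm = LinearRegression().fit(X, y)
print(beta_hat)
print(lm.intercept_, lm.coef_)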

1. Linear Models

1.1. Linear Regression

1.1.2. Simple & Multiple LR

mpg weight cylinders model year
0 18.0 3504 8 70
1 15.0 3693 8 70
2 18.0 3436 8 70
  • Multiple LR: mpg vs Cyl + Year.

1. Linear Models

1.1. Linear Regression

1.1.2. Simple & Multiple LR

  • R-squared: \(R^2=1-\frac{\text{RSS}}{\text{TSS}}=1-\frac{\sum_{i=1}^n(y_i-\hat{y}_i)^2}{\sum_{i=1}^n(y_i-\overline{y}_n)^2}=\frac{\color{red}{\text{V}(\hat{Y})}}{\color{blue}{\text{V}(Y)}}.\)

  • Adjusted R-squared: \(R^2_{\text{adj}}=1-\frac{n-1}{n-d-1}(1-R^2).\)

Code
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def adj_r2(y_true, y_pred, d):
    n = len(y_true)
    return 1-(n-1)/(n-d-1)*(1-r2_score(y_true, y_pred))

# One-hot encode origin (drop the first level to avoid redundancy)
cat = pd.get_dummies(data['origin'].astype(object), drop_first=True)*1
cat.columns = ['orig2', 'orig3']
X_full, y_full = pd.concat(
    [cat, 
     data.drop(columns=['mpg', 'car name', 'origin'])],
     axis=1), data['mpg']
lm_full = LinearRegression()
lm_full.fit(X_full, y_full)
pred_full = lm_full.predict(X_full)
X2 = data[['model year', 'cylinders']]
X1 = data[['weight']]

lm2 = LinearRegression()
lm1 = LinearRegression()
lm2.fit(X2, y_full)
lm1.fit(X1, y_full)

df_r2 = pd.DataFrame({
    'R2' : [r2_score(y_full, lm1.predict(X1)),
            r2_score(y_full, lm2.predict(X2)),
            r2_score(y_full, pred_full)],
    'Adj-R2' : [adj_r2(y_full, lm1.predict(X1), 1),
                adj_r2(y_full, lm2.predict(X2), 2),
                adj_r2(y_full, pred_full, X_full.shape[1])]
}, index=['LR1', 'LR2', 'LR-full'])
df_r2
R2 Adj-R2
LR1 0.692630 0.691842
LR2 0.715070 0.713606
LR-full 0.824199 0.820527
  • Can we do better?

1. Linear Models

1.1. Linear Regression

1.1.2. Simple & Multiple LR: \(t\)-test of coefficients

  • We can test \(H_0: \beta_j=0\) against \(H_1:\beta_j\neq 0\) using a \(t\)-test.
  • If at least one of the following two assumptions holds:
    • the sample size is large enough (\(n>30\)),
    • or the residuals follow a Gaussian distribution with constant variance,
    then, under \(H_0\), \[t_j=\frac{\widehat{\beta}_j}{s_{j}}\sim {\cal T}(n-d-1),\] where \(s_j\) is the standard error of \(\widehat{\beta}_j\).
  • For a given level \(\alpha\), we REJECT \(H_0:\beta_j=0\) if \(|t_j|>t_{\alpha/2}\), where \(t_{\alpha/2}\) satisfies \(\mathbb{P}(|{\cal T}(n-d-1)|\leq t_{\alpha/2})=1-\alpha\).

1. Linear Models

1.1. Linear Regression

1.1.2. Simple & Multiple LR: \(t\)-test of coefficients

import statsmodels.api as sm
# df is assumed to be the same auto-mpg data with 'model year' renamed to 'year' (inferred from the summary below)
model = sm.OLS(df['mpg'], sm.add_constant(df[['cylinders', 'year']]))
results = model.fit()
print(results.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                    mpg   R-squared:                       0.715
Model:                            OLS   Adj. R-squared:                  0.714
Method:                 Least Squares   F-statistic:                     488.1
Date:                Fri, 28 Nov 2025   Prob (F-statistic):          8.84e-107
Time:                        10:57:54   Log-Likelihood:                -1115.1
No. Observations:                 392   AIC:                             2236.
Df Residuals:                     389   BIC:                             2248.
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        -17.1464      4.944     -3.468      0.001     -26.866      -7.426
cylinders     -2.9981      0.132    -22.718      0.000      -3.258      -2.739
year           0.7502      0.061     12.276      0.000       0.630       0.870
==============================================================================
Omnibus:                       24.502   Durbin-Watson:                   1.290
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               31.620
Skew:                           0.513   Prob(JB):                     1.36e-07
Kurtosis:                       3.940   Cond. No.                     1.79e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.79e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
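The same quantities can be pulled out programmatically from the fitted statsmodels results object, for example:

Code
# Coefficients, standard errors, t-statistics and p-values as pandas Series
print(results.params)
print(results.bse)
print(results.tvalues)
print(results.pvalues)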

1. Linear Models

1.1. Linear Regression

1.1.3. Polynomial Features

  • Predicting target using linear form of inputs may be unrealistic!
  • More complicated forms of inputs might be better for predicting the target!
  • Ex: mpg vs weight (a scikit-learn sketch follows): \[\widehat{\text{mpg}}=\color{blue}{\beta_0}+\sum_{j=1}^p\color{blue}{\beta_j}\text{weight}^j.\]
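A minimal sketch of fitting this polynomial model with scikit-learn's PolynomialFeatures; the degree p=3 here is only an illustrative choice:

Code
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Expand weight into [weight, weight^2, weight^3]
poly = PolynomialFeatures(degree=3, include_bias=False)
W_poly = poly.fit_transform(data[["weight"]])

lm_poly = LinearRegression()
lm_poly.fit(W_poly, data["mpg"])
print(lm_poly.score(W_poly, data["mpg"]))  # in-sample R^2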

1. Linear Models

1.1. Linear Regression

1.1.3. Polynomial Features

1. Linear Models

1.1. Linear Regression

1.1.3. Polynomial Features

Code
X_full2 = pd.concat([X_full, (data['weight'] ** 2).rename('weight2')], axis=1)  # add a squared-weight term
lm_full2 = LinearRegression()
lm_full2.fit(X_full2, y_full)

df_r2 = pd.concat([
    df_r2,
    pd.DataFrame({
    'R2' : [r2_score(y_full, lm_full2.predict(X_full2))],
    'Adj-R2' : [adj_r2(y_full, lm_full2.predict(X_full2), X_full2.shape[1])]}, 
    index=['LR-full-Poly2'])])
df_r2
R2 Adj-R2
LR1 0.692630 0.691842
LR2 0.715070 0.713606
LR-full 0.824199 0.820527
LR-full-Poly2 0.858133 0.854791
  • Can you make it even better?

1. Linear Models

1.1. Linear Regression

1.1.4. Regularization

  • High-degree polynomials often lead to overfitting (an overly flexible model).
  • Regularization is a common way to limit this flexibility by penalizing the magnitude of the coefficients.

1. Linear Models

1.1. Linear Regression

1.1.4. Regularization: Ridge Regression

  • Model: \(\hat{y}=\color{blue}{\beta_0}+\color{blue}{\beta_1}x_1+\dots+\color{blue}{\beta_d}x_d\),

  • Objective: Search for \(\color{blue}{\vec{\beta}=[\beta_0,\dots,\beta_d]}\) minimizing the following loss function for some \(\color{green}{\alpha}>0\): \[{\cal L}_{\text{ridge}}(\vec{\beta})=\color{red}{\underbrace{\frac{1}{n}\sum_{i=1}^n(y_i-\widehat{y}_i)^2}_{\text{MSE}}}+\color{green}{\alpha}\color{blue}{\underbrace{\sum_{j=0}^{d}\beta_j^2}_{\text{Magnitude}}}.\]

  • Recall: SLR & MLR seek to minimize only MSE.

1. Linear Models

1.1. Linear Regression

1.1.4. Regularization: Ridge Regression

  • Large \(\color{green}{\alpha}\Rightarrow\) strong penalty \(\Rightarrow\) small \(\vec{\beta}\).
  • Small \(\color{green}{\alpha}\Rightarrow\) weak penalty \(\Rightarrow\) freer \(\vec{\beta}\).
  • 🔑 Objective: Learn the best \(\color{green}{\alpha}>0\).
  • Loss: \({\cal L}_{\text{ridge}}(\vec{\beta})=\color{red}{\underbrace{\frac{1}{n}\sum_{i=1}^n(y_i-\widehat{y}_i)^2}_{\text{MSE}}}+\color{green}{\alpha}\color{blue}{\underbrace{\sum_{j=0}^{d}\beta_j^2}_{\text{Magnitude}}}.\)

1. Linear Models

1.1. Linear Regression

1.1.4. Regularization: Ridge Regression

  • Consider mpg vs polynomials of horsepower.

1. Linear Models

1.1. Linear Regression

1.1.4. Regularization: Ridge Regression

How to find a suitable regularization strength \(\color{green}{\alpha}\)?

Tuning Regularization Strength \(\color{green}{\alpha}\) Using \(K\)-fold Cross-Validation

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
import numpy as np
# Data: polynomial features of horsepower (degree 10)
X, y = data[["horsepower"]], data['mpg']
poly = PolynomialFeatures(degree=10)
X_poly = poly.fit_transform(X)
# List of regularization strengths (alphas) to search over
alphas = list(np.linspace(0.01, 100000, 100))
# List to store all losses
loss = []
coefficients = {f'alpha={alpha}': [] for alpha in alphas}
for alp in alphas:
    model = Ridge(alpha=alp)
    score = -cross_val_score(model, X_poly, y, cv=10, 
                scoring='neg_mean_absolute_error').mean()
    loss.append(score)
    # Fit
    model.fit(X_poly, y)
    coefficients[f'alpha={alp}'] = model.coef_
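The \(\color{green}{\alpha}\) with the smallest cross-validated error can then be selected, for example:

Code
# Regularization strength with the lowest mean CV absolute error
best_alpha = alphas[int(np.argmin(loss))]
print(f"Best alpha: {best_alpha:.2f}, CV MAE: {min(loss):.3f}")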

1. Linear Models

1.1. Linear Regression

1.1.4. Regularization: Ridge Regression

How to find a suitable regularization strength \(\color{green}{\alpha}\)?

Tuning Regularization Strength \(\color{green}{\alpha}\) Using \(K\)-fold Cross-Validation

1. Linear Models

1.1. Linear Regression

1.1.4. Regularization: Ridge Regression

Pros

  • It works well when there are inputs that are approximately linearly related with the target.
  • It helps stabilize the estimates when inputs are highly correlated.
  • It can prevent overfitting and should be used along with polynomial features.
  • It is effective when the number of inputs exceeds the number of observations.
  • Highly interpretable: each coefficient directly quantifies the influence of its term on the target.

Cons

  • It does not work well when the input-output relationships are highly non-linear.
  • It may introduce bias into the coefficient estimates.
  • It does not perform feature selection.

1. Linear Models

1.1. Linear Regression

1.1.5. Regularization: Lasso Regression

  • Model: \(\hat{y}=\color{blue}{\beta_0}+\color{blue}{\beta_1}x_1+\dots+\color{blue}{\beta_d}x_d\),
  • Objective: Search for \(\vec{\beta}=[\beta_0,\dots,\beta_d]\) minimizing the following loss function for some \(\color{green}{\alpha}>0\): \[{\cal L}_{\text{lasso}}(\vec{\beta})=\color{red}{\underbrace{\frac{1}{n}\sum_{i=1}^n(y_i-\widehat{y}_i)^2}_{\text{MSE}}}+\color{green}{\alpha}\color{blue}{\underbrace{\sum_{j=0}^{d}|\beta_j|}_{\text{Magnitude}}}.\]

1. Linear Models

1.1. Linear Regression

1.1.5. Regularization: Lasso Regression

  • Large \(\color{green}{\alpha}\Rightarrow\) strong penalty \(\Rightarrow\) small \(\vec{\beta}\).
  • Small \(\color{green}{\alpha}\Rightarrow\) weak penalty \(\Rightarrow\) freer \(\vec{\beta}\).
  • 🔑 Objective: Learn the best \(\color{green}{\alpha}>0\).
  • Loss: \({\cal L}_{\text{lasso}}(\vec{\beta})=\color{red}{\underbrace{\frac{1}{n}\sum_{i=1}^n(y_i-\widehat{y}_i)^2}_{\text{MSE}}}+\color{green}{\alpha}\color{blue}{\underbrace{\sum_{j=0}^{d}|\beta_j|}_{\text{Magnitude}}}.\)

1. Linear Models

1.1. Linear Regression

1.1.5. Regularization: Lasso Regression

Tuning Regularization Strength \(\color{green}{\alpha}\) Using \(K\)-fold Cross-Validation
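A minimal sketch of this tuning step using scikit-learn's LassoCV, hypothetically reusing the same degree-10 polynomial features of horsepower as in the ridge example (the alpha grid is an illustrative choice):

Code
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.pipeline import make_pipeline

# Same setting as the ridge example: degree-10 polynomial of horsepower
X_poly = PolynomialFeatures(degree=10).fit_transform(data[["horsepower"]])
y = data["mpg"]

# Lasso is scale-sensitive, so standardize the polynomial features first
lasso_cv = make_pipeline(
    StandardScaler(),
    LassoCV(alphas=np.linspace(0.001, 10, 100), cv=10, max_iter=50000))
lasso_cv.fit(X_poly, y)

best_alpha = lasso_cv.named_steps["lassocv"].alpha_
coefs = lasso_cv.named_steps["lassocv"].coef_
print(f"Best alpha: {best_alpha:.4f}")
print("Non-zero coefficients:", int((coefs != 0).sum()))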

1. Linear Models

1.1. Linear Regression

1.1.5. Regularization: Lasso Regression

Pros

  • Lasso inherently performs feature selection as the regularization parameter \(\alpha\) increases (coefficients of less important variables are driven exactly to \(0\)).
  • It works well when there are many inputs (high-dimensional data) and some highly correlated with the target.
  • It can handle collinearities (many redundant inputs).
  • It can prevent overfitting and offers high interpretability.

Cons

  • It does not work well when the input-output relationships are highly non-linear.
  • It may introduce bias into the coefficient estimates.
  • It is sensitive to the scale of the data, so proper scaling of predictors is crucial before applying the method.

1. Linear Models

1.2. Logistic Regression
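This part is demonstrated on the slides with the same auto-mpg data; a minimal sketch, assuming an illustrative binary target defined as whether mpg exceeds its median (not necessarily the target used in class):

Code
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative binary target: is the car more fuel-efficient than the median?
X_cls = data[["weight", "horsepower", "model year"]]
y_cls = (data["mpg"] > data["mpg"].median()).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X_cls, y_cls, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Coefficients are interpretable on the log-odds scale
print(dict(zip(X_cls.columns, clf.coef_[0])))
print("Test accuracy:", clf.score(X_test, y_test))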

1. Linear Models

1.3. Decision Trees
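The tree part is likewise shown on slides; a minimal sketch of an interpretable regression tree on the same data (depth 3 is an illustrative choice to keep the tree readable):

Code
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor, plot_tree

# Shallow tree: each split and leaf value can be read directly from the plot
tree = DecisionTreeRegressor(max_depth=3, random_state=42)
tree.fit(data[["weight", "horsepower", "model year"]], data["mpg"])

plt.figure(figsize=(12, 5))
plot_tree(tree, feature_names=["weight", "horsepower", "model year"],
          filled=True, rounded=True)
plt.show()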

Interpretable Model Summary:

  • In any ML project, understanding the data via EDA is the essential first step.
  • Insights from EDA can guide us toward suitable models and highlight what to watch out for.
  • In interpretable ML, models should be transparent and interpretable by design: linear models & decision trees.
  • Refinement is driven by the key problems detected in the EDA step.
  • Interpretation is done directly from the model's parameters and structure.